Automatic language identification using long short-term memory recurrent neural networks
Authors
Abstract
This work explores the use of Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) for automatic language identification (LID). The use of RNNs is motivated by their greater ability to model sequences compared with the feed-forward networks used in previous works. We show that LSTM RNNs can effectively exploit temporal dependencies in acoustic data, learning features relevant to language discrimination. The proposed approach is compared to baseline i-vector and feed-forward Deep Neural Network (DNN) systems on the NIST Language Recognition Evaluation 2009 dataset. We show that LSTM RNNs achieve better performance than our best DNN system with an order of magnitude fewer parameters. Further, combining the different systems leads to significant performance improvements (up to 28%).
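To make the approach concrete, the following is a minimal sketch of an LSTM-based LID scorer: a single LSTM cell run over a sequence of acoustic feature frames, with a softmax over languages computed from the final hidden state. This is an illustration only, not the paper's architecture; the function name, feature dimensionality, hidden size, and number of languages are all assumptions for the example.

```python
import numpy as np

def lstm_lid_forward(frames, params):
    """Illustrative LSTM forward pass for language ID (hypothetical helper,
    not the paper's exact model). Consumes acoustic frames sequentially and
    returns language posterior probabilities from the last hidden state."""
    Wx, Wh, b, Wo, bo = params          # gate input weights, recurrent weights, biases
    H = Wh.shape[1]                     # hidden size
    h = np.zeros(H)                     # hidden state
    c = np.zeros(H)                     # cell state
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x in frames:                    # one acoustic feature frame per time step
        z = Wx @ x + Wh @ h + b         # all four gate pre-activations at once
        i, f, o, g = np.split(z, 4)     # input, forget, output gates; candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # new hidden state
    logits = Wo @ h + bo                # per-language scores
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

# Toy usage: 39-dim features (e.g. MFCCs + deltas), 8 hidden units, 3 languages.
rng = np.random.default_rng(0)
D, H, L = 39, 8, 3
params = (rng.standard_normal((4 * H, D)) * 0.1,
          rng.standard_normal((4 * H, H)) * 0.1,
          np.zeros(4 * H),
          rng.standard_normal((L, H)) * 0.1,
          np.zeros(L))
frames = rng.standard_normal((50, D))   # 50 frames ~ 0.5 s at a 10 ms frame shift
post = lstm_lid_forward(frames, params)
```

The last-hidden-state readout is one simple pooling choice; frame-level posteriors averaged over time are another common option for LID systems.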
Similar resources
Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks
Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vectors and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (∼3 s). In this contribution we present an open-source, end-to-end, LSTM RNN system running on limited computational resources...
End-to-End Language Identification Using Attention-Based Recurrent Neural Networks
This paper proposes a novel attention-based recurrent neural network (RNN) to build an end-to-end automatic language identification (LID) system. Inspired by the success of attention mechanism on a range of sequence-to-sequence tasks, this work introduces the attention mechanism with long short term memory (LSTM) encoder to the sequence-to-tag LID task. This unified architecture extends the end...
A Step Beyond Local Observations with a Dialog Aware Bidirectional GRU Network for Spoken Language Understanding
Architectures of Recurrent Neural Networks (RNNs) have recently become a very popular choice for Spoken Language Understanding (SLU) problems; however, they represent a large family of different architectures that can furthermore be combined to form more complex neural networks. In this work, we compare different recurrent networks, such as simple Recurrent Neural Networks (RNNs), Long Short-Term Memory...
Articulatory movement prediction using deep bidirectional long short-term memory based recurrent neural networks and word/phone embeddings
Automatic prediction of articulatory movements from speech or text can be beneficial for many applications such as speech recognition and synthesis. A recent approach has reported state-of-the-art performance in speech-to-articulatory prediction using feed-forward neural networks. In this paper, we investigate the feasibility of using bidirectional long short-term memory based recurrent neural n...
Sequence Modeling with Recurrent Tensor Networks
We introduce the recurrent tensor network, a recurrent neural network model that replaces the matrix-vector multiplications of a standard recurrent neural network with bilinear tensor products. We compare its performance against networks that employ long short-term memory (LSTM) networks. Our results demonstrate that using tensors to capture the interactions between network inputs and history c...